REMARKS

Reconsideration and withdrawal of the rejections set forth in the Office Action mailed April 
30, 2009 are respectfully requested in view of this response and amendment. Claims 1-12, 14-17 and 
19 are pending in this application. Claims 1-12 and 14-18 have been rejected. 

Claims 1, 14 and 15 have been amended for the sole reason of advancing prosecution. The
claims have been amended to correct typographical errors and otherwise conform the claims to U.S. 
patent practice. Claim 15 has been amended to rewrite the claim in independent form. Claim 18 has 
been canceled without prejudice or disclaimer to the subject matter therein. Applicants, by amending 
or canceling any claims, make no admission as to the validity of any rejection made by the Examiner 
against any of these claims. Applicants reserve the right to reassert the original claim scope of any 
claim, in a continuing application. 

Claim 19 is newly presented. The subject matter of newly presented claim 19 is supported 
throughout the specification, claims and figures as originally filed, at least by original claims 1 and 
15, and Figs. 1C, 9 and 10C.

Support for the claims as amended appears throughout the specification, claims and figures 
as originally filed. It is respectfully submitted that the amendments do not introduce any new matter 
within the meaning of 35 U.S.C. §132. 

In view of the following, further and favorable consideration is respectfully requested. 

Claim Objections 

The Examiner has objected to claims 1 and 18 as being duplicated claims, and objected to the
claim marking of claim 4 in the Response filed February 19, 2009.

Response

By this Response and Amendment, claim 18 has been canceled without prejudice or 
disclaimer. Applicants present the remaining pending claims with proper markings with respect to 
the status of the claims. Accordingly the objections are moot; Applicants request reconsideration and 
withdrawal of these objections. 

Claim Rejections under 35 U.S.C. §101

The Examiner has rejected claims 1-12 and claims 15-17 under 35 U.S.C. §101 as being
directed to non-statutory subject matter.

Response 

Independent claims 1 and 15 have been amended, and as amended, these rejections are 
respectfully traversed. Claims 2-12, 16 and 17 have been amended, or depend from amended claims, 
and as amended, these rejections are respectfully traversed. Claims 1 and 15 have been amended to 
recite a "computer-implemented method" and to stipulate that the method steps are performed by a 
computer, thereby tying the method to another statutory class. Amended claims 1 and 15 are now 
presented in similar form as previously presented claim 18 (now canceled), which Applicants note 
was not rejected under 35 U.S.C. §101 in the Office Action.

Accordingly, Applicants submit that all of the claims, including claims 1-12 and 15-17, are
directed to statutory subject matter, as defined in 35 U.S.C. §101 and within the interpretation provided
under Supreme Court precedent and Federal Circuit decisions. Therefore, Applicants respectfully submit 
that the rejections under 35 U.S.C. §101 should be withdrawn. 

Claim Rejections under 35 U.S.C. §103(a)

The Examiner has rejected claims 1-4 and 15-18 under 35 U.S.C. §103(a) as being 
unpatentable over U.S. Patent Application Publication No. 2002/0042793 to Choi (hereinafter 
referred to as "Choi") in view of U.S. Patent No. 4,839,853 to Deerwester et al. (hereinafter referred 
to as "Deerwester et al.") and rejected claims 5-12 and 14 as being unpatentable over Choi, in view 
of Deerwester et al. and further in view of U.S. Patent No. 6,128,613 to Wong et al. (hereinafter 
referred to as "Wong et al."). 

Response 

Claim 18 has been canceled without prejudice or disclaimer to the contents; accordingly the 
rejection of claim 18 is moot. Applicants respectfully traverse the remaining rejections since the
cited references do not disclose all of the features of the presently claimed subject matter.

To establish a prima facie case of obviousness, the Examiner must establish: (1) some 
suggestion or motivation to modify the references exists; (2) a reasonable expectation of success; 
and (3) the prior art references teach or suggest all of the claim limitations. Amgen, Inc. v. Chugai 
Pharm. Co., 18 USPQ2d 1016, 1023 (Fed. Cir. 1991); In re Fine, 5 USPQ2d 1596, 1598 (Fed. Cir. 
1988); In re Wilson, 165 USPQ 494, 496 (CCPA 1970). 

A prima facie case of obviousness must also include a showing of the reasons why it would be
obvious to modify the references to produce the present invention. See DyStar Textilfarben GmbH v. C.H.
Patrick, 464 F.3d 1356 (Fed. Cir. 2006). The Examiner bears the initial burden to provide some
convincing line of reasoning as to why the artisan would have found the claimed invention to have been
obvious in light of the teachings. Id. at 1366.



Overview 

Independent claim 1 recites: 

A computer-implemented method of determining cluster attractors for use in 
clustering a plurality of documents, each document comprising at least one term, each 
term comprising one or more words, the method comprising: causing a computer to 
calculate, in respect of each term, a probability distribution that is indicative of the 
frequency of occurrence of one other term in the instance where a document comprises 
said term and said one other term, and in the instance where a document comprises said 
term and more than one other term, the respective frequency of occurrence of each other 
term, that co-occurs with said term in at least one of said documents; causing a computer 
to calculate, in respect of each term, the entropy of the respective probability distribution; 
causing the computer to select at least one of said probability distributions as a cluster 
attractor depending on the respective entropy value. 

These features are neither described nor rendered obvious by Choi or Deerwester et al., whether taken
alone or in combination. Claims 2-12 depend directly or indirectly from claim 1.
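
By way of illustration only, and without limiting the scope of claim 1, the three recited steps might be sketched as follows. The function names, the normalization of co-occurrence counts into probabilities, the use of base-2 Shannon entropy, and the rule of selecting the k lowest-entropy distributions are all assumptions made for this sketch; claim 1 requires only that the selection depend on the respective entropy value.

```python
import math
from collections import Counter, defaultdict


def cooccurrence_distributions(docs):
    """For each term, build a probability distribution over the other terms
    that co-occur with it in at least one document (hypothetical helper)."""
    counts = defaultdict(Counter)
    for doc in docs:
        tf = Counter(doc)  # frequency of each term within this document
        for term in tf:
            for other, n in tf.items():
                if other != term:
                    # accumulate the frequency of each co-occurring term
                    counts[term][other] += n
    dists = {}
    for term, c in counts.items():
        total = sum(c.values())
        # assumption: counts are normalized so that each distribution sums to 1
        dists[term] = {other: n / total for other, n in c.items()}
    return dists


def entropy(dist):
    """Shannon entropy (base 2) of a probability distribution."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)


def select_attractors(docs, k=1):
    """Select k distributions as cluster attractors depending on entropy.
    Assumption: the lowest-entropy distributions are chosen; ties are broken
    by term name so that the result is deterministic."""
    dists = cooccurrence_distributions(docs)
    ranked = sorted(dists, key=lambda t: (entropy(dists[t]), t))
    return {t: dists[t] for t in ranked[:k]}
```

In this sketch the attractors are obtained from the term statistics of the document collection alone, before any document has been assigned to a cluster.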

Independent claim 14 recites:

An apparatus for determining cluster attractors for use in clustering a plurality of 
documents, each document comprising at least one term, each term comprising one or 
more words, the apparatus comprising: means for calculating, in respect of each term, a 
probability distribution that is indicative of the frequency of occurrence of one other term 
in the instance where a document comprises said term and said one other term, and in the 
instance where a document comprises said term and more than one other term, the 
respective frequency of occurrence of each other term, the entropy of the respective 
probability distribution; and means for selecting at least one of said probability 
distributions as a cluster attractor depending on the respective entropy value. 

These features are neither described nor rendered obvious by Choi or Deerwester et al. or Wong et al.,
whether taken alone or in combination.

Independent claim 15 recites: 

A computer-implemented method of clustering a plurality of documents, each 
document comprising at least one term, each term comprising one or more words, the
method comprising causing a computer to calculate, in respect of each term, a probability 
distribution that is indicative of the frequency of occurrence of one other term in the 
instance where a document comprises said term and said one other term, and in the 
instance where a document comprises said term and more than one other term, the 
respective frequency of occurrence of each other term, that co-occurs with said term in at 
least one of said documents; causing a computer to calculate, in respect of each term, the 
entropy of the respective probability distribution; causing the computer to select at least 
one of said probability distributions as a cluster attractor depending on the respective 
entropy value; causing the computer to compare each document with each cluster 
attractor; and causing the computer to assign each document to one or more cluster 
attractors depending on the similarity between the document and the cluster attractors. 

These features are neither described nor rendered obvious by Choi or Deerwester et al., whether taken
alone or in combination. Claims 16 and 17 depend directly from claim 15.
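
By way of illustration only, the two additional steps of claim 15 (comparing each document with each cluster attractor, and assigning each document depending on similarity) might be sketched as follows. The function names, the representation of a document as a normalized term-frequency distribution, the choice of cosine similarity, and the assignment of each document to a single best-matching attractor are assumptions made for this sketch; the claim itself permits assignment to one or more cluster attractors and does not prescribe a similarity measure.

```python
import math
from collections import Counter


def doc_distribution(doc):
    """Normalized term-frequency distribution of one document (assumed form)."""
    tf = Counter(doc)
    total = sum(tf.values())
    return {t: n / total for t, n in tf.items()}


def cosine(p, q):
    """Cosine similarity between two sparse distributions (assumed measure)."""
    dot = sum(v * q.get(t, 0.0) for t, v in p.items())
    norm = math.sqrt(sum(v * v for v in p.values())) * math.sqrt(
        sum(v * v for v in q.values())
    )
    return dot / norm if norm else 0.0


def assign(docs, attractors):
    """Compare each document with each attractor and assign it to the most
    similar one. `attractors` maps a label to a probability distribution and
    is supplied before any cluster exists."""
    assignment = {}
    for i, doc in enumerate(docs):
        p = doc_distribution(doc)
        assignment[i] = max(attractors, key=lambda name: cosine(p, attractors[name]))
    return assignment
```

In this sketch no document is ever compared with another document; each document is compared only with the previously determined attractors.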



Choi relates to "[a] method of order-ranking document clusters using entropy data and 
Bayesian self-organizing feature maps(SOM) is provided in which an accuracy of information 
retrieval is improved by adopting Bayesian SOM for performing a real-time document clustering for 
relevant documents in accordance with a degree of semantic similarity between entropy data 
extracted using entropy value and user profiles and query words given by a user, wherein the 
Bayesian SOM is a combination of Bayesian statistical technique and Kohonen network that is a 
type of an unsupervised learning." See Choi Abstract. 

Deerwester et al. relates to "[a] methodology for retrieving textual data objects. ... The 
information is treated in the statistical domain by presuming that there is an underlying, latent 
semantic structure in the usage of words in the data objects. Estimates to this latent structure are 
utilized to represent and retrieve objects. A user query is recouched in the new statistical domain and 
then processed in the computer system to extract the underlying meaning to respond to the query." 
See Deerwester et al. Abstract.

Wong et al. relates to "[a] computer-based method and system for establishing topic words to 
represent a document, the topic words being suitable for use in document retrieval. The method 
includes determining document keywords from the document; classifying each of the document 
keywords into one of a plurality of preestablished keyword classes; and selecting words as the topic 
words, each selected word from a different one of the preestablished keyword classes, to minimize a 
cost function on proposed topic words. The cost function may be a metric of dissimilarity, such as 
cross-entropy, between a first distribution of likelihood of appearance by the plurality of document 
keywords in a typical document and a second distribution of likelihood of appearance by the 
plurality of document keywords in a typical document, the second distribution being approximated 
using proposed topic words. The cost function can be a basis for sorting the priority of the 
documents." See Wong et al. Abstract. 

Clarification regarding the claims (as previously presented) 

With regard to the Examiner's arguments in paragraph 12 (of page 4) of the Office Action 
concerning claim 1, Applicants respectfully note that the Examiner has mistakenly introduced an 
extra word into claim 1 that has seemingly caused the Examiner to break claim 1 down into separate 
features in a manner that distorts claim 1. Specifically, the Examiner has asserted one of the features 
of claim 1 as being "calculating, in respect of each term, a probability distribution," and a 
subsequent feature of claim 1 as being "and indicative of the frequency of occurrence of one other 
term in the instance where a document comprises said term and said one other term, and in the 
instance where a document comprises said term and more than one other term, the respective 
frequency of occurrence of each other term" (emphasis added, the bolded word "and" does not
appear in claim 1). However, claim 1 recites "calculating, in respect of each term, a probability 
distribution indicative of the frequency of occurrence of one other term in the instance where a 
document comprises said term and said one other term, and in the instance where a document 
comprises said term and more than one other term, the respective frequency of occurrence of each 
other term, that co-occurs with said term in at least one of said documents" (emphases added) from 
which it can be seen that the phrase "indicative of the frequency... that co-occurs with said term in at 
least one of said documents" is a definition of the recited probability distribution. As such, it is 
respectfully submitted that these two phrases cannot be split into two separate features without 
rendering each phrase incomplete and consequently distorting claim 1. To clarify this issue,
Applicants have introduced the phrase "that is" between "distribution" and "indicative" in claim 1 and
the independent claims 14, 15 and 19. 

Similarly, the Examiner has combined the phrase "that co-occurs with said term in at least 
one of said documents" recited in claim 1 with the feature "calculating, in respect of each term, the 
entropy of the respective probability distribution," although it can be seen from the preceding 
paragraph that the phrase "that co-occurs with said term in at least one of said documents" is also 
part of the definition of the probability distribution. 

Applicants also respectfully note that although the Examiner asserts at the top of page 5 of 
the Office Action that Choi discloses the claim features of "and indicative of the frequency of 
occurrence of one other term in the instance where a document comprises said term and said one 
other term, and in the instance where a document comprises said term and more than one other term, 
the respective frequency of occurrence of each other term," the Examiner also asserts midway down 
page 5 of the Office Action that "Choi does not specifically discloses [sic]: and indicative of the
frequency of occurrence of one other term in the instance where a document comprises said term and 
said one other term, and in the instance where a document comprises said term and more than one 
other term" (bold emphasis added). Consequently, Applicants are not certain as to which features of 
claim 1 the Examiner is asserting as being described by Choi. If the rejection is to be maintained, 
Applicants respectfully request clarification in this regard. 

Therefore, Applicants respectfully request that the Examiner take the foregoing submissions 
into account when considering the amended claims in light of the following comments. 

Rejection of claims 1-4 and 15-18 

Applicants respectfully submit that Deerwester et al. fails to cure the deficiencies of Choi 
with respect to the claimed subject matter in accordance with Applicants' independent claims 1 and 
15, and further, does not suggest a teaching or motivation to reach such subject matter as claimed in 
the instant application. Applicants further respectfully submit that the cited prior art of record does 
not suggest a teaching or motivation to reach such subject matter as claimed in the instant 
application. 

With regard to claim 1, Applicants disagree with the Examiner's assertion that Choi discloses
"a ... method of determining cluster attractors for a plurality of documents," as recited in the present
claims. In contrast to the claimed subject matter, Choi does not use cluster attractors when 
clustering documents. Instead, Choi clusters documents by comparing documents to each other and 
putting documents that have a high similarity in the same cluster. See Choi paragraphs [0109] and 
[0110]; for example, "to group individual documents, a measure for clustering documents is needed.
As a measurement, similarity and dissimilarity between documents is used. Here, if similarity
between documents is employed as a measurement, documents having relatively higher similarity 
are classified into the same group. If dissimilarity is employed, documents having relatively lower 
dissimilarity are classified into the same group. The most fundamental method employing 
dissimilarity between documents is to use distance between documents" (Choi explains the specific 
manner in which he clusters documents by measuring their similarity/dissimilarity to one another in 
Choi paragraphs [0111]-[0121]).

The Examiner has cited Choi paragraph [0126], but this paragraph relates to computing the 
distance between clusters after the clusters have been formed. Therefore this paragraph does not 
relate to cluster attractors, which are used to form the clusters and accordingly must be calculated 
before the clusters are formed. The Examiner also refers to Choi paragraphs [0132] - [0134]. 
However, Choi merely describes the centroid linkage method which is a method of measuring the 
distance between two clusters that have already been formed - see for example Choi paragraph 
[0133] "as a distance between the two clusters c1 and c2, the distance between the centroids of the
two clusters is used" - i.e. the "centroid linkage method" referred to by Choi is performed after the 
clusters have been formed and therefore does not relate to cluster attractors, which are required 
before the clusters are formed. In this connection, the "centroid" mentioned by Choi in paragraphs 
[0133] and [0134] can only be calculated after the cluster is formed (see for example the equation 
given in paragraph [0133] in which centroids are calculated by using data relating to the documents 
in the clusters that have already been formed). Therefore, the centroid is not the same as a cluster 
attractor within the meaning of claim 1, since a cluster attractor is used to form the clusters and 
accordingly must be identified before the cluster is formed. To help clarify this distinction, 
Applicants have introduced the phrase "for use in clustering" into claim 1. Applicants also
respectfully remind the Examiner of the discussions during the Interview conducted December 5, 
2008 in which distinction was made between a centroid and a cluster attractor; Applicants again 
refer the Examiner to Fig. 3 of the instant application, which distinguishes between a centroid and a 
cluster attractor. 
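
Purely for illustration, the kind of centroid linkage computation discussed above can be sketched as follows. The arithmetic-mean definition of a centroid, the Euclidean distance, the vector representation of documents, and the function names are assumptions made for this sketch and are not taken from the equation in Choi paragraph [0133].

```python
import math


def centroid(members):
    """Mean vector of the documents that already belong to one cluster.
    The cluster membership must be known before this can be computed."""
    if not members:
        raise ValueError("a centroid requires an already-formed, non-empty cluster")
    dim = len(members[0])
    return [sum(v[i] for v in members) / len(members) for i in range(dim)]


def centroid_linkage_distance(cluster_1, cluster_2):
    """Distance between two existing clusters, taken here as the Euclidean
    distance between their centroids (assumed metric)."""
    return math.dist(centroid(cluster_1), centroid(cluster_2))
```

Both functions in this sketch take already-formed clusters as their inputs, which reflects the distinction drawn above: a centroid is derived from the members of an existing cluster, whereas a cluster attractor is identified before the clusters are formed.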

Applicants also respectfully disagree that Choi teaches to "calculate, in respect of each term, 
a probability distribution." In this connection, the Examiner cited Choi paragraphs [0028] and 
[0029]. However, Choi paragraph [0029] describes "prior information" in "the form of probability 
distribution" and it can be seen from Choi paragraph [0027] that this "prior information" is an initial 
value for a neural network. An initial value for a neural network has nothing to do with the terms, 
e.g. words, of the document and so in contrast to that recited in claim 1, Choi does not describe that 
a probability distribution is calculated "in respect of each term." 

In particular, Choi does not describe the particular probability distribution recited in claim 1, 
namely a "probability distribution that is indicative of the frequency of occurrence of one other term
in the instance where a document comprises said term and said one other term, and in the instance 
where a document comprises said term and more than one other term, the respective frequency of 
occurrence of each other term that co-occurs with said term in at least one of said documents" 
(emphases added). As mentioned above, in contrast to claim 1, the probability distribution referred 
to by Choi in paragraph [0029] has nothing to do with the terms of the document and therefore has 
nothing to do with the frequency of occurrence and/or co-occurrence of terms within documents. 

The Examiner has also cited Choi paragraph [0060] in connection with this part of claim 1.
However, Choi paragraph [0060] simply states that documents can be clustered by grouping
documents with similar contents into the same cluster by a document clustering technique. As
discussed above, this may be regarded as teaching away from claim 1 since claim 1 is concerned 
with determining cluster attractors for clustering documents, whereas Choi, as demonstrated in 
paragraph [0060], clusters documents by comparing their similarity with each other. 

Applicants also respectfully disagree that Choi teaches to "calculate, in respect of each term, 
the entropy of the respective probability distribution ... select at least one of said probability
distributions as a cluster attractor depending on the respective entropy value," recited in claim 1. In 
this connection, the Examiner cited Choi paragraphs [0051] to [0053]. It can be seen from Choi 
paragraph [0052] (and paragraph [0028]) that Choi only describes calculating an entropy value for 
"query words given by user profiles with respect of key words for each of the documents." 
Therefore, Choi does not teach to calculate the entropy value of a probability distribution of any 
type, and particularly not a probability distribution of the type recited in claim 1. Accordingly, Choi 
cannot "select at least one of said probability distributions as a cluster attractor depending on the
respective entropy value": firstly, because Choi does not use cluster attractors (instead, documents are
compared directly with one another), and secondly, because Choi does not calculate entropy values of
any probability distributions.

Therefore, it is respectfully submitted that claim 1 is novel over Choi at least because of the 
above-discussed features. 

With regard to Deerwester et al., Applicants respectfully submit that although Table 2 of 
Deerwester et al. describes nine technical document titles in a "term-by-document" matrix in which 
the frequency of occurrence of term i in document j is recorded, Deerwester et al. does not describe any of
the features recited in claim 1. Firstly, Deerwester et al. is not concerned with document clustering
and so does not require cluster attractors or describe any method of determining cluster attractors. 
Secondly, Deerwester et al. does not teach to "calculate, in respect of each term, a probability 
distribution that is indicative of the frequency of occurrence of one other term in the instance where 
a document comprises said term and said one other term, and in the instance where a document 
comprises said term and more than one other term, the respective frequency of occurrence of each 
other term, that co-occurs with said term in at least one of said documents" as presently claimed. In this
connection, it is noted in particular that Table 2 of Deerwester et al. describes actual frequencies of 
occurrence of the terms and documents and not probability distributions that are indicative of the 
frequency of occurrence of terms, as per claim 1 . Further, because Deerwester et al. does not teach 
probability distributions, it cannot teach to calculate the entropy of the probability distributions. 
Also, because it does not use cluster attractors, Deerwester. et al. cannot teach to "select a probability 
distribution as a cluster attractor depending on its respective entropy value" as presently claimed. 

Accordingly, since neither Choi nor Deerwester et al. describes any of the above-discussed
features of claim 1, it is respectfully submitted that their combined teachings would also not lead a
person having ordinary skill in the art to the subject matter of claim 1. Moreover, Choi and Deerwester
et al. are not technically compatible with one another - Choi relates to document clustering while 
Deerwester et al. relates to retrieving textual data objects by user query by mapping all of the terms 
and documents from the document corpus into a semantic space. See Deerwester et al. col. 2, lines 
24-40. Applicants respectfully submit that a person having ordinary skill in the art would not 
combine the teachings of these references, at least because they address entirely different issues. 
Therefore, it is respectfully submitted that claim 1 is not obvious over the combined teachings of
Choi and Deerwester et al. 

With regard to claim 2, Applicants respectfully disagree that Choi describes a probability 
distribution within the meaning of claim 2 for the reasons stated above, namely that the probability 
distribution described by Choi relates only to the "prior information" to be used as an initial value 
for a neural network and not the frequency of occurrence of terms as recited in claim 1 (which 
definition of the probability distribution also applies to claim 2 which is dependent on claim 1). 

Applicants respectfully disagree that Deerwester et al. describes any of the features of claim 
2. In particular, Deerwester et al. does not describe making any calculations with respect to "co- 
occurring" terms as per claim 2. Deerwester et al. does not mention co-occurrence at all. The 
matrix shown in Table 2 and referred to in the passages at column 3, lines 34-65 and column 4, lines 
1-9 as identified by the Examiner relates only to the frequency of occurrence of term i in document j
and not at all to the co-occurrence of the terms in documents. In this connection, the Examiner also 
refers to column 10 in Table 6 of Deerwester et al. arguing that co-occurrence is demonstrated. 
Applicants respectfully disagree. In Table 6 Deerwester et al. does not present any data about term 
co-occurrence. Table 6 instead shows the results of a search where the query describes the user's 
information needs (for example looking for experts in "automatic representation of semantic 
structure" within a company) and the results list is a set of relevant research for this query. The 
numeric values in this table show how relevant a specific research group is to the query. 

Therefore, Applicants respectfully submit that neither Choi nor Deerwester et al. teach, show 
or suggest the features of claim 2 and as such claim 2 is novel, non-obvious and consequently 
patentable, when compared to Choi and Deerwester et al. individually or combined. 

With regard to claim 3, Applicants respectfully disagree that Choi describes a
conditional probability distribution within the meaning of claim 3 for the reasons stated above, 
namely that the probability distribution described by Choi relates only to the "prior information" to 
be used as an initial value for a neural network and not the frequency of occurrence of terms as 
recited in claim 1 (which definition of the probability distribution also applies to claim 3, which is
ultimately dependent on claim 1). 

Applicants respectfully disagree that Deerwester et al. describes any of the features of
claim 3. In particular, Deerwester et al. does not describe making any calculations with respect to
"co-occurring" terms as per claim 3. Deerwester et al. does not mention co-occurrence at all. The 
matrix shown in Table 2 and referred to in the passages at column 3, lines 34-65, and column 4, lines 
1-9 as identified by the Examiner relates only to the frequency of occurrence of term i in document j
and not at all to the co-occurrence of terms in documents. In this connection, the Examiner also
refers to column 10 in Table 6 of Deerwester et al. arguing that co-occurrence is demonstrated.
Applicants respectfully disagree. In Table 6 Deerwester et al. does not present any data about term 
co-occurrence. Table 6 instead shows the results of a search where the query describes the user's 
information needs (for example looking for experts in "automatic representation of semantic 
structure" within a company) and the results list is a set of relevant research for this query. The 
numeric values in this table show how relevant a specific research group is to the query. 

Therefore, Applicants respectfully submit that neither Choi nor Deerwester et al. describe the 
feature of claim 3 and as such claim 3 is both novel and non-obvious when compared to Choi and 
Deerwester et al. individually or combined. 

With regard to claim 4, as indicated above with reference to claim 2, neither Choi nor
Deerwester et al. describe an indicator as defined in claim 2, or an indicator in respect of each co- 
occurring term. It follows that neither Choi nor Deerwester et al. describes the indicator of claim 4, 
since claim 4 incorporates all of the features of Claim 2. In particular, Applicants can find no 
reference in Deerwester et al. or Choi to the normalization of indicators with respect to the total 
number of terms in a document. 

With regard to claim 15, claim 15 includes all of the features of claim 1 and has been
amended to recite these explicitly. Therefore, it is respectfully submitted that claim 15 is novel and
non-obvious over the individual and combined teachings of Choi and Deerwester et al. for the reasons
set out above in relation to claim 1.

Claim 15 additionally recites the features "causing the computer to compare each document
with each cluster attractor; and causing the computer to assign each document to one or more
cluster attractors depending on the similarity between the document and the cluster attractors."
Choi does not describe these additional features of claim 15. This is because, as described above in
relation to claim 1, Choi does not use cluster attractors when forming its clusters. Instead, Choi
forms clusters by comparing documents to each other (see paragraphs [0109] and [0110]).

In connection with claim 15, the Examiner has highlighted the Abstract of Choi. It is
respectfully submitted however that the Choi Abstract has no relevance to claim 15 because claim
15 recites a computer-implemented method of clustering a plurality of documents, i.e. claim 15
relates to the formation of clusters. In contrast, the Choi Abstract describes "a method of order 
ranking document clusters", i.e. the Choi Abstract relates only to what happens after the clusters are 
formed. As indicated previously, Choi describes a method of clustering, but this is achieved by
comparing the similarities/dissimilarities between documents rather than determining cluster
attractors in the manner defined in claim 15 and then comparing the documents to the cluster
attractors as per claim 15. It is respectfully submitted that claim 15 is novel and non-obvious over 
the individual and combined teachings of Choi and Deerwester et al. 

With regard to claim 16, Applicants respectfully disagree that Choi describes the features of 
claim 16. For the reasons given above in relation to claim 1, Choi does not describe probability 
distributions that are indicative of the frequency of occurrence of each term in a document, nor does 
he compare respective probability distributions of each document with each probability distribution 
selected as a cluster attractor. As explained previously, Choi does not use cluster attractors and the
"centroid linkage" described by Choi cannot be regarded as a cluster attractor for the reasons set
out above in relation to claim 1, namely that Choi's centroid linkage method is a method of measuring
the distance between clusters that have already been formed. Further, with regard to Deerwester et
al., as explained above in relation to claim 1, although Deerwester et al. describes a table recording
the frequency of occurrence of terms in documents, these are not presented as probability
distributions so Deerwester et al. does not provide any of the features of claim 16.

With regard to claim 17, Applicants respectfully disagree with the Examiner that Choi 
describes organizing the documents within each cluster. In this connection, the Examiner refers to 
Choi's title, but this relates to organizing the clusters themselves and not the documents within the 
clusters as per claim 17. Although Choi describes assigning weights, paragraph [0012] of Choi only 
describes assigning weights to each indexable word of extracted documents based on the number of 
occurrences of the inverted document. This is not the same as assigning a respective weight to each 
document, the value of the weight depending on the similarity between the probability distribution 
of the document and the probability distribution of the cluster attractor (not least because Choi does 
not describe cluster attractors as outlined above). For similar reasons, Choi does not describe 
assigning a respective weight to each pair of compared documents, the value of the weight 
depending on the similarity between the compared respective probability distributions of each 
document of the pair. Further, Choi does not describe the calculation of a minimum spanning tree 
for the cluster based on respective calculated weights as per claim 17. In this connection, the 
Examiner refers to paragraph [0098]. However, this paragraph only refers to suffix tree clustering 
(STC), which is a clustering algorithm and has nothing to do with organizing documents within each 
cluster attractor and moreover is not the same as a minimum spanning tree for a cluster, i.e. the STC 
algorithm is for forming clusters and not for organizing clusters. 

Rejection of claims 5-12 and 14 

Applicants respectfully submit that Wong et al. fails to cure the deficiencies of Choi and 
Deerwester et al. with respect to the claimed subject matter in accordance with Applicants' 
independent claims 1 and 14, and further, does not suggest a teaching or motivation to reach such 
subject matter as claimed in the instant application. Applicants further respectfully submit that the 
cited prior art of record does not suggest a teaching or motivation to reach such subject matter as 
claimed in the instant application. 

With regard to claim 5, Applicants respectfully disagree with the Examiner that Choi, Deerwester et 
al. and Wong et al. together describe the features of claim 5. Although Wong et al. describes 
extracting a sub-set of words within a document, such extraction does not relate to selecting a 
probability distribution as a cluster attractor as per claim 5 since neither Choi, nor Deerwester et al., 
nor Wong et al. use cluster attractors. With regard to claim 6, which is dependent on claim 5, the 
same comments apply as were made in relation to claim 5. Therefore, Applicants submit that claims 
5 and 6 are novel, non-obvious and consequently patentable over Choi, Deerwester et al. and Wong 
et al., whether taken alone or in combination. 

With regard to claim 7, the Examiner refers to paragraph [0105] of Choi, which mentions 
entropy. As discussed above in relation to claim 1, Choi only calculates an entropy value between 
key words of each web document and the user's query word and user profile (see for example 
paragraph [0028]). In contrast, claim 7 relates to an entropy value of a probability distribution of 
one or more terms from each sub-set (the probability distribution itself is further defined by claim 1 
on which claim 7 depends indirectly). Therefore, Choi does not describe the calculation or use of 
the entropy of a probability distribution, or the specific probability distribution defined in claim 1 
and recited in claim 7. Further, neither Choi, nor Deerwester et al., nor Wong et al., describe cluster 
attractors. Although Wong et al. describes a "satiable threshold" at col. 7, line 3, this threshold 
relates to a threshold for a weight assigned to an arc between two nodes of a word network. See 
Wong et al. col. 6, lines 64-66. Therefore, the threshold of Wong et al. does not relate to the subset 
of Wong et al. and so Wong et al. does not describe a threshold assigned to a subset, and particularly 
not an entropy threshold assigned to a sub-set of terms. Therefore claim 7 is novel and non-obvious 
when compared to the combined teachings of Choi, Deerwester et al. and Wong et al. 

With regard to claim 8, which depends on claim 7, the same submissions are made as were 
made in relation to claim 7. 

With regard to claim 9, although Choi uses the word "frequencies" and contains a paragraph 
on "disjoint clustering," Choi's frequencies relate to the frequencies of key words used in the user 
search. Choi does not teach, show or suggest associating a frequency range with a sub-set. Also, 
claim 9 does not recite "disjoint clustering." Instead it recites that the "frequency ranges for 
respective subsets are disjoint" - in contrast, Choi describes that the clusters are disjoint and not that 
frequency ranges are disjoint (notwithstanding the fact that Choi does not describe frequency 
ranges). Wong et al. likewise does not describe associating frequency ranges with its subsets. 

With regard to claims 10, 11 and 12, similar comments apply as were made in relation to 
claims 5, 7 and 9. 

Therefore, it is respectfully submitted that claims 9-12 are novel, non-obvious and 
consequently patentable over Choi, Deerwester et al. and Wong et al., whether taken alone or in 
combination. 

With regard to independent claim 14, claim 14 is an apparatus claim reciting, inter alia, 
features corresponding to the features discussed above with respect to method claim 1. Applicants 
therefore submit that claim 14 is novel, non-obvious and consequently patentable over Choi and 
Deerwester et al. (and Wong et al.) for the reasons discussed above in relation to claim 1. 

Newly presented claim 19 

Newly presented claim 19 is an apparatus claim reciting, inter alia, features corresponding to 
the features discussed above with respect to method claim 15. Applicants respectfully submit that 
claim 19 is novel, non-obvious and consequently patentable over Choi and Deerwester et al. (and 
Wong et al.) for the reasons discussed above in relation to claim 15. 

CONCLUSION 

In light of the foregoing, Applicants submit that the application is in condition for allowance. 
If the Examiner believes the application is not in condition for allowance, Applicants respectfully 
request that the Examiner call the undersigned. 

In the event this paper is not timely filed, Applicants hereby petition for an appropriate extension 
of time. Please charge any fee deficiency or credit any overpayment to Deposit Account No. 14-0112. 